Overview

Dataset statistics

Number of variables: 27
Number of observations: 91047
Missing cells: 173711
Missing cells (%): 7.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.8 MiB
Average record size in memory: 216.0 B
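
The missing-cell percentage follows from the three counts above. A minimal sketch, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 27
n_observations = 91047
missing_cells = 173711

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, this reproduces the figure reported in the table.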

Variable types

Numeric: 8
Categorical: 15
Unsupported: 4

Warnings

tipo_persona has constant value "Natural" (Constant)
entidad has a high cardinality: 151 distinct values (High cardinality)
tipo_persona is highly correlated with genero and 12 other fields (High correlation)
genero is highly correlated with tipo_persona (High correlation)
tiene_casa_propia is highly correlated with tipo_persona (High correlation)
estado_civil is highly correlated with tipo_persona (High correlation)
codeudor is highly correlated with tipo_persona (High correlation)
municipio_expedicion is highly correlated with tipo_persona and 2 other fields (High correlation)
tipo_identificacion is highly correlated with tipo_persona (High correlation)
forma_pago is highly correlated with tipo_persona (High correlation)
municipio_nacimiento is highly correlated with tipo_persona and 2 other fields (High correlation)
estado_final is highly correlated with tipo_persona (High correlation)
tipo_venta is highly correlated with tipo_persona (High correlation)
periodo_credito is highly correlated with tipo_persona (High correlation)
municipio_residencia is highly correlated with tipo_persona and 2 other fields (High correlation)
municipio_credito is highly correlated with tipo_persona (High correlation)
empresa has 3993 (4.4%) missing values (Missing)
cargo has 6020 (6.6%) missing values (Missing)
tiempo_servicio has 17808 (19.6%) missing values (Missing)
otros_ingresos_mensual has 69004 (75.8%) missing values (Missing)
otros_ingresos_concepto has 76885 (84.4%) missing values (Missing)
sueldo is highly skewed (γ1 = 55.32516627) (Skewed)
otros_ingresos_mensual is highly skewed (γ1 = 55.02864117) (Skewed)
valor_credito is highly skewed (γ1 = 48.6531017) (Skewed)
Row is uniformly distributed (Uniform)
Row has unique values (Unique)
empresa is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
cargo is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
tiempo_servicio is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
otros_ingresos_concepto is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
otros_ingresos_mensual has 15957 (17.5%) zeros (Zeros)
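
The per-column missing counts in these warnings can be cross-checked against the dataset total of 173711 missing cells. A small sketch; the dictionary is assembled by hand from the warnings, plus the single missing value listed in the entidad variable section, which falls below the warning threshold:

```python
n_observations = 91047

# Missing counts as reported per column.
missing_by_column = {
    "empresa": 3993,
    "cargo": 6020,
    "tiempo_servicio": 17808,
    "otros_ingresos_mensual": 69004,
    "otros_ingresos_concepto": 76885,
    "entidad": 1,  # taken from the entidad section, not from a warning
}

for column, count in missing_by_column.items():
    print(f"{column}: {count} ({100 * count / n_observations:.1f}%)")

total_missing = sum(missing_by_column.values())
print(f"total missing cells: {total_missing}")
```

These six counts add up to the reported total, which suggests that no other column has missing values.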

Reproduction

Analysis started: 2021-05-05 21:28:21.415092
Analysis finished: 2021-05-05 21:29:00.133553
Duration: 38.72 seconds
Software version: pandas-profiling v2.12.0
Download configuration: config.yaml
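
The duration is simply the difference between the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2021-05-05 21:28:21.415092")
finished = datetime.fromisoformat("2021-05-05 21:29:00.133553")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```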

Variables

Row
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 91047
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45524
Minimum: 1
Maximum: 91047
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 711.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4553.3
Q1: 22762.5
Median: 45524
Q3: 68285.5
95-th percentile: 86494.7
Maximum: 91047
Range: 91046
Interquartile range (IQR): 45523

Descriptive statistics

Standard deviation: 26283.14932
Coefficient of variation (CV): 0.5773470986
Kurtosis: -1.2
Mean: 45524
Median Absolute Deviation (MAD): 22762
Skewness: 0
Sum: 4144823628
Variance: 690803938
Monotonicity: Strictly increasing
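
Because Row is unique, strictly increasing, and spans 1 to 91047, it behaves like a plain row counter, and its statistics have closed forms. A sketch under the assumption that Row takes exactly the values 1..n and that the reported variance is the sample variance (divisor n - 1):

```python
import math

n = 91047  # number of observations

total = n * (n + 1) // 2       # sum of 1..n
mean = (n + 1) / 2             # mean of 1..n
variance = n * (n + 1) / 12    # sample variance (divisor n - 1) of 1..n
std = math.sqrt(variance)
cv = std / mean                # coefficient of variation

# Population excess kurtosis of a discrete uniform distribution on 1..n.
excess_kurtosis = -6 * (n**2 + 1) / (5 * (n**2 - 1))

print(total, mean, variance)
print(round(std, 5), round(cv, 10), round(excess_kurtosis, 4))
```

The results agree with the reported sum, mean, variance, standard deviation, and CV, and the kurtosis is approximately -1.2. The column therefore carries no information beyond row order.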
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
2047                   1       < 0.1%
43648                  1       < 0.1%
27288                  1       < 0.1%
25241                  1       < 0.1%
31386                  1       < 0.1%
29339                  1       < 0.1%
19100                  1       < 0.1%
17053                  1       < 0.1%
23198                  1       < 0.1%
21151                  1       < 0.1%
Other values (91037)   91037   > 99.9%

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%

Value   Count   Frequency (%)
91047   1       < 0.1%
91046   1       < 0.1%
91045   1       < 0.1%
91044   1       < 0.1%
91043   1       < 0.1%

tipo_identificacion
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 21
Median length: 20
Mean length: 19.99852823
Min length: 3

Characters and Unicode

Total characters: 1820806
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Cedula de Ciudadania
2nd row: Cedula de Ciudadania
3rd row: Cedula de Ciudadania
4th row: Cedula de Ciudadania
5th row: Cedula de Ciudadania

Value                   Count   Frequency (%)
Cedula de Ciudadania    90727   99.6%
Cedula de Extranjeria   262     0.3%
Tarjeta de Identidad    23      < 0.1%
Registro Civil          17      < 0.1%
NIT                     16      < 0.1%
Pasaporte               2       < 0.1%
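
The length and character totals reported for tipo_identificacion can be recomputed from this frequency table alone. A small sketch; the dictionary is copied from the table, and the variable names are illustrative:

```python
# Category counts copied from the frequency table.
counts = {
    "Cedula de Ciudadania": 90727,
    "Cedula de Extranjeria": 262,
    "Tarjeta de Identidad": 23,
    "Registro Civil": 17,
    "NIT": 16,
    "Pasaporte": 2,
}

n_rows = sum(counts.values())
total_chars = sum(len(label) * count for label, count in counts.items())
total_spaces = sum(label.count(" ") * count for label, count in counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, total_spaces)
print(round(mean_length, 8))
print(min(map(len, counts)), max(map(len, counts)))
```

The totals match the reported 1820806 characters, 182041 spaces, and a mean length of about 19.9985, with minimum and maximum lengths of 3 and 21.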
Histogram of lengths of the category
ValueCountFrequency (%)
de91012
33.3%
cedula90989
33.3%
ciudadania90727
33.2%
extranjeria262
 
0.1%
tarjeta23
 
< 0.1%
identidad23
 
< 0.1%
registro17
 
< 0.1%
civil17
 
< 0.1%
nit16
 
< 0.1%
pasaporte2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a363767
20.0%
d363524
20.0%
e182328
10.0%
182041
10.0%
i181790
10.0%
C181733
10.0%
u181716
10.0%
n91012
 
5.0%
l91006
 
5.0%
r566
 
< 0.1%
Other values (14)1323
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1456657
80.0%
Uppercase Letter182108
 
10.0%
Space Separator182041
 
10.0%

Most frequent character per category

ValueCountFrequency (%)
a363767
25.0%
d363524
25.0%
e182328
12.5%
i181790
12.5%
u181716
12.5%
n91012
 
6.2%
l91006
 
6.2%
r566
 
< 0.1%
t327
 
< 0.1%
j285
 
< 0.1%
Other values (6)336
 
< 0.1%
ValueCountFrequency (%)
C181733
99.8%
E262
 
0.1%
I39
 
< 0.1%
T39
 
< 0.1%
R17
 
< 0.1%
N16
 
< 0.1%
P2
 
< 0.1%
ValueCountFrequency (%)
182041
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1638765
90.0%
Common182041
 
10.0%

Most frequent character per script

ValueCountFrequency (%)
a363767
22.2%
d363524
22.2%
e182328
11.1%
i181790
11.1%
C181733
11.1%
u181716
11.1%
n91012
 
5.6%
l91006
 
5.6%
r566
 
< 0.1%
t327
 
< 0.1%
Other values (13)996
 
0.1%
ValueCountFrequency (%)
182041
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1820806
100.0%

Most frequent character per block

ValueCountFrequency (%)
a363767
20.0%
d363524
20.0%
e182328
10.0%
182041
10.0%
i181790
10.0%
C181733
10.0%
u181716
10.0%
n91012
 
5.0%
l91006
 
5.0%
r566
 
< 0.1%
Other values (14)1323
 
0.1%

genero
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 9
Median length: 8
Mean length: 8.458653223
Min length: 8

Characters and Unicode

Total characters: 770135
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowFemenino
2nd rowFemenino
3rd rowFemenino
4th rowFemenino
5th rowFemenino
ValueCountFrequency (%)
Femenino49288
54.1%
Masculino41759
45.9%
Histogram of lengths of the category
ValueCountFrequency (%)
femenino49288
54.1%
masculino41759
45.9%

Most occurring characters

ValueCountFrequency (%)
n140335
18.2%
e98576
12.8%
i91047
11.8%
o91047
11.8%
F49288
 
6.4%
m49288
 
6.4%
M41759
 
5.4%
a41759
 
5.4%
s41759
 
5.4%
c41759
 
5.4%
Other values (2)83518
10.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter679088
88.2%
Uppercase Letter91047
 
11.8%

Most frequent character per category

ValueCountFrequency (%)
n140335
20.7%
e98576
14.5%
i91047
13.4%
o91047
13.4%
m49288
 
7.3%
a41759
 
6.1%
s41759
 
6.1%
c41759
 
6.1%
u41759
 
6.1%
l41759
 
6.1%
ValueCountFrequency (%)
F49288
54.1%
M41759
45.9%

Most occurring scripts

ValueCountFrequency (%)
Latin770135
100.0%

Most frequent character per script

ValueCountFrequency (%)
n140335
18.2%
e98576
12.8%
i91047
11.8%
o91047
11.8%
F49288
 
6.4%
m49288
 
6.4%
M41759
 
5.4%
a41759
 
5.4%
s41759
 
5.4%
c41759
 
5.4%
Other values (2)83518
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII770135
100.0%

Most frequent character per block

ValueCountFrequency (%)
n140335
18.2%
e98576
12.8%
i91047
11.8%
o91047
11.8%
F49288
 
6.4%
m49288
 
6.4%
M41759
 
5.4%
a41759
 
5.4%
s41759
 
5.4%
c41759
 
5.4%
Other values (2)83518
10.8%

estado_civil
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 11
Median length: 7
Mean length: 8.223499951
Min length: 5

Characters and Unicode

Total characters: 748725
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowUnion Libre
2nd rowUnion Libre
3rd rowSoltero
4th rowSoltero
5th rowSoltero
ValueCountFrequency (%)
Union Libre34793
38.2%
Casado27732
30.5%
Soltero27100
29.8%
Viudo862
 
0.9%
Divorciado560
 
0.6%
Histogram of lengths of the category
ValueCountFrequency (%)
libre34793
27.6%
union34793
27.6%
casado27732
22.0%
soltero27100
21.5%
viudo862
 
0.7%
divorciado560
 
0.4%

Most occurring characters

ValueCountFrequency (%)
o118707
15.9%
i71568
9.6%
n69586
9.3%
r62453
 
8.3%
e61893
 
8.3%
a56024
 
7.5%
U34793
 
4.6%
34793
 
4.6%
L34793
 
4.6%
b34793
 
4.6%
Other values (11)169322
22.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter588092
78.5%
Uppercase Letter125840
 
16.8%
Space Separator34793
 
4.6%

Most frequent character per category

ValueCountFrequency (%)
o118707
20.2%
i71568
12.2%
n69586
11.8%
r62453
10.6%
e61893
10.5%
a56024
9.5%
b34793
 
5.9%
d29154
 
5.0%
s27732
 
4.7%
l27100
 
4.6%
Other values (4)29082
 
4.9%
ValueCountFrequency (%)
U34793
27.6%
L34793
27.6%
C27732
22.0%
S27100
21.5%
V862
 
0.7%
D560
 
0.4%
ValueCountFrequency (%)
34793
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin713932
95.4%
Common34793
 
4.6%

Most frequent character per script

ValueCountFrequency (%)
o118707
16.6%
i71568
10.0%
n69586
9.7%
r62453
8.7%
e61893
8.7%
a56024
 
7.8%
U34793
 
4.9%
L34793
 
4.9%
b34793
 
4.9%
d29154
 
4.1%
Other values (10)140168
19.6%
ValueCountFrequency (%)
34793
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII748725
100.0%

Most frequent character per block

ValueCountFrequency (%)
o118707
15.9%
i71568
9.6%
n69586
9.3%
r62453
 
8.3%
e61893
 
8.3%
a56024
 
7.5%
U34793
 
4.6%
34793
 
4.6%
L34793
 
4.6%
b34793
 
4.6%
Other values (11)169322
22.6%

edad
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.11591815
Minimum: 18
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 711.4 KiB

Quantile statistics

Minimum: 18
5-th percentile: 26
Q1: 35
Median: 43
Q3: 52
95-th percentile: 64
Maximum: 98
Range: 80
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.92135914
Coefficient of variation (CV): 0.2702280636
Kurtosis: -0.3784568293
Mean: 44.11591815
Median Absolute Deviation (MAD): 9
Skewness: 0.3400818195
Sum: 4016622
Variance: 142.1188037
Monotonicity: Not monotonic
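
Several of the edad statistics are derived from the others. A short sketch showing how range, IQR, variance, CV, and sum relate to the reported mean, standard deviation, and quantiles; the inputs are copied from the tables, so the results are only as precise as the rounded figures:

```python
# Figures copied from the edad tables.
n = 91047
mean = 44.11591815
std = 11.92135914
minimum, maximum = 18, 98
q1, q3 = 35, 52

value_range = maximum - minimum  # spread between extremes
iqr = q3 - q1                    # spread of the middle 50%
variance = std ** 2              # variance is the squared standard deviation
cv = std / mean                  # coefficient of variation
total = mean * n                 # sum recovered from the mean

print(value_range, iqr)
print(round(variance, 4), round(cv, 6), round(total))
```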
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
44                  3405    3.7%
40                  3167    3.5%
41                  2935    3.2%
45                  2855    3.1%
35                  2811    3.1%
49                  2796    3.1%
46                  2687    3.0%
36                  2656    2.9%
31                  2603    2.9%
39                  2567    2.8%
Other values (63)   62565   68.7%

Value   Count   Frequency (%)
18      12      < 0.1%
19      66      0.1%
20      215     0.2%
21      291     0.3%
22      461     0.5%

Value   Count   Frequency (%)
98      2       < 0.1%
90      3       < 0.1%
89      1       < 0.1%
88      2       < 0.1%
87      1       < 0.1%

municipio_residencia
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 14
Median length: 6
Mean length: 6.560325985
Min length: 4

Characters and Unicode

Total characters: 597298
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st rowARAUQUITA
2nd rowARAUQUITA
3rd rowARAUQUITA
4th rowARAUQUITA
5th rowARAUQUITA
ValueCountFrequency (%)
ARAUCA39508
43.4%
TAME24614
27.0%
ARAUQUITA9258
 
10.2%
PUERTO RONDON4662
 
5.1%
SARAVENA4288
 
4.7%
PUERTO JORDAN4161
 
4.6%
FORTUL3653
 
4.0%
PANAMA335
 
0.4%
HATO COROZAL213
 
0.2%
CRAVO NORTE148
 
0.2%
Other values (17)207
 
0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
arauca39508
39.4%
tame24614
24.6%
arauquita9258
 
9.2%
puerto8823
 
8.8%
rondon4662
 
4.7%
saravena4288
 
4.3%
jordan4161
 
4.2%
fortul3653
 
3.6%
panama335
 
0.3%
hato213
 
0.2%
Other values (24)726
 
0.7%

Most occurring characters

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (16)57560
 
9.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter588103
98.5%
Space Separator9194
 
1.5%
Other Punctuation1
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
100.0%
ValueCountFrequency (%)
.1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin588103
98.5%
Common9195
 
1.5%

Most frequent character per script

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
> 99.9%
.1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII597297
> 99.9%
None1
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (15)57559
 
9.6%
ValueCountFrequency (%)
Á1
100.0%

tipo_persona
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 637329
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNatural
2nd rowNatural
3rd rowNatural
4th rowNatural
5th rowNatural
ValueCountFrequency (%)
Natural91047
100.0%
Histogram of lengths of the category
ValueCountFrequency (%)
natural91047
100.0%

Most occurring characters

ValueCountFrequency (%)
a182094
28.6%
N91047
14.3%
t91047
14.3%
u91047
14.3%
r91047
14.3%
l91047
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter546282
85.7%
Uppercase Letter91047
 
14.3%

Most frequent character per category

ValueCountFrequency (%)
a182094
33.3%
t91047
16.7%
u91047
16.7%
r91047
16.7%
l91047
16.7%
ValueCountFrequency (%)
N91047
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin637329
100.0%

Most frequent character per script

ValueCountFrequency (%)
a182094
28.6%
N91047
14.3%
t91047
14.3%
u91047
14.3%
r91047
14.3%
l91047
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII637329
100.0%

Most frequent character per block

ValueCountFrequency (%)
a182094
28.6%
N91047
14.3%
t91047
14.3%
u91047
14.3%
r91047
14.3%
l91047
14.3%

empresa
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing3993
Missing (%)4.4%
Memory size711.4 KiB

municipio_nacimiento
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 14
Median length: 6
Mean length: 6.560325985
Min length: 4

Characters and Unicode

Total characters: 597298
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st rowARAUQUITA
2nd rowARAUQUITA
3rd rowARAUQUITA
4th rowARAUQUITA
5th rowARAUQUITA
ValueCountFrequency (%)
ARAUCA39508
43.4%
TAME24614
27.0%
ARAUQUITA9258
 
10.2%
PUERTO RONDON4662
 
5.1%
SARAVENA4288
 
4.7%
PUERTO JORDAN4161
 
4.6%
FORTUL3653
 
4.0%
PANAMA335
 
0.4%
HATO COROZAL213
 
0.2%
CRAVO NORTE148
 
0.2%
Other values (17)207
 
0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
arauca39508
39.4%
tame24614
24.6%
arauquita9258
 
9.2%
puerto8823
 
8.8%
rondon4662
 
4.7%
saravena4288
 
4.3%
jordan4161
 
4.2%
fortul3653
 
3.6%
panama335
 
0.3%
hato213
 
0.2%
Other values (24)726
 
0.7%

Most occurring characters

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (16)57560
 
9.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter588103
98.5%
Space Separator9194
 
1.5%
Other Punctuation1
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
100.0%
ValueCountFrequency (%)
.1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin588103
98.5%
Common9195
 
1.5%

Most frequent character per script

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
> 99.9%
.1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII597297
> 99.9%
None1
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (15)57559
 
9.6%
ValueCountFrequency (%)
Á1
100.0%

municipio_expedicion
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 14
Median length: 6
Mean length: 6.560325985
Min length: 4

Characters and Unicode

Total characters: 597298
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st rowARAUQUITA
2nd rowARAUQUITA
3rd rowARAUQUITA
4th rowARAUQUITA
5th rowARAUQUITA
ValueCountFrequency (%)
ARAUCA39508
43.4%
TAME24614
27.0%
ARAUQUITA9258
 
10.2%
PUERTO RONDON4662
 
5.1%
SARAVENA4288
 
4.7%
PUERTO JORDAN4161
 
4.6%
FORTUL3653
 
4.0%
PANAMA335
 
0.4%
HATO COROZAL213
 
0.2%
CRAVO NORTE148
 
0.2%
Other values (17)207
 
0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
arauca39508
39.4%
tame24614
24.6%
arauquita9258
 
9.2%
puerto8823
 
8.8%
rondon4662
 
4.7%
saravena4288
 
4.3%
jordan4161
 
4.2%
fortul3653
 
3.6%
panama335
 
0.3%
hato213
 
0.2%
Other values (24)726
 
0.7%

Most occurring characters

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (16)57560
 
9.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter588103
98.5%
Space Separator9194
 
1.5%
Other Punctuation1
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
100.0%
ValueCountFrequency (%)
.1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin588103
98.5%
Common9195
 
1.5%

Most frequent character per script

ValueCountFrequency (%)
A189898
32.3%
R74982
 
12.7%
U70628
 
12.0%
T46747
 
7.9%
C40042
 
6.8%
E37927
 
6.4%
O26928
 
4.6%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (14)48365
 
8.2%
ValueCountFrequency (%)
9194
> 99.9%
.1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII597297
> 99.9%
None1
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
A189898
31.8%
R74982
 
12.6%
U70628
 
11.8%
T46747
 
7.8%
C40042
 
6.7%
E37927
 
6.3%
O26928
 
4.5%
M24994
 
4.2%
N18280
 
3.1%
I9312
 
1.6%
Other values (15)57559
 
9.6%
ValueCountFrequency (%)
Á1
100.0%

tiene_casa_propia
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 182094
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowSi
2nd rowSi
3rd rowNo
4th rowNo
5th rowSi
ValueCountFrequency (%)
Si64704
71.1%
No26343
28.9%
Histogram of lengths of the category
ValueCountFrequency (%)
si64704
71.1%
no26343
28.9%

Most occurring characters

ValueCountFrequency (%)
S64704
35.5%
i64704
35.5%
N26343
14.5%
o26343
14.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter91047
50.0%
Lowercase Letter91047
50.0%

Most frequent character per category

ValueCountFrequency (%)
S64704
71.1%
N26343
28.9%
ValueCountFrequency (%)
i64704
71.1%
o26343
28.9%

Most occurring scripts

ValueCountFrequency (%)
Latin182094
100.0%

Most frequent character per script

ValueCountFrequency (%)
S64704
35.5%
i64704
35.5%
N26343
14.5%
o26343
14.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII182094
100.0%

Most frequent character per block

ValueCountFrequency (%)
S64704
35.5%
i64704
35.5%
N26343
14.5%
o26343
14.5%

cargo
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing6020
Missing (%)6.6%
Memory size711.4 KiB

sueldo
Real number (ℝ≥0)

SKEWED

Distinct: 683
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2859063.011
Minimum: 0
Maximum: 3204740882
Zeros: 356
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 711.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 700000
Q1: 908526
Median: 1500000
Q3: 2500000
95-th percentile: 4500000
Maximum: 3204740882
Range: 3204740882
Interquartile range (IQR): 1591474

Descriptive statistics

Standard deviation: 46993410.9
Coefficient of variation (CV): 16.4366475
Kurtosis: 3260.414092
Mean: 2859063.011
Median Absolute Deviation (MAD): 591474
Skewness: 55.32516627
Sum: 2.6030911 × 10^11
Variance: 2.208380668 × 10^15
Monotonicity: Not monotonic
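
The very large skewness and the gap between mean (about 2.86 million) and median (1.5 million) are what a handful of extreme values would produce. The sketch below illustrates this with the population moment coefficient of skewness, g1 = m3 / m2^1.5. This is an assumption for illustration: the profiler may use a bias-adjusted estimator, so its figure would differ slightly on the same data. The sample is hypothetical; it borrows a few values that appear in the sueldo tables but not their real proportions.

```python
def skewness(values):
    """Population moment coefficient of skewness, g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical sample: six ordinary salaries, then the same plus one extreme value.
typical = [908526, 1000000, 1200000, 1500000, 2000000, 2500000]
with_outlier = typical + [3204740882]

print(skewness(typical))
print(skewness(with_outlier))
```

A single extreme value is enough to push the skewness of this small sample well above that of the ordinary salaries alone, which is why the median and IQR are the more representative summaries for sueldo.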
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
908526               11026   12.1%
2000000              9249    10.2%
1000000              7694    8.5%
1500000              6764    7.4%
1200000              5502    6.0%
3000000              4903    5.4%
800000               3043    3.3%
4000000              2992    3.3%
1800000              2854    3.1%
2500000              2743    3.0%
Other values (673)   34277   37.6%

Value   Count   Frequency (%)
0       356     0.4%
152     131     0.1%
800     2       < 0.1%
30000   1       < 0.1%
93000   1       < 0.1%

Value        Count   Frequency (%)
3204740882   3       < 0.1%
3203428840   2       < 0.1%
3174339461   3       < 0.1%
3016878081   4       < 0.1%
2000000000   16      < 0.1%

tiempo_servicio
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing17808
Missing (%)19.6%
Memory size711.4 KiB

otros_ingresos_mensual
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 96
Distinct (%): 0.4%
Missing: 69004
Missing (%): 75.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1421585.435
Minimum: 0
Maximum: 3143089323
Zeros: 15957
Zeros (%): 17.5%
Negative: 0
Negative (%): 0.0%
Memory size: 711.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 300000
95-th percentile: 2000000
Maximum: 3143089323
Range: 3143089323
Interquartile range (IQR): 300000

Descriptive statistics

Standard deviation: 56026266.07
Coefficient of variation (CV): 39.4111143
Kurtosis: 3050.806515
Mean: 1421585.435
Median Absolute Deviation (MAD): 0
Skewness: 55.02864117
Sum: 3.133600775 × 10^10
Variance: 3.13894249 × 10^15
Monotonicity: Not monotonic
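
With 75.8% of the values missing, it matters which denominator each figure uses. The sketch below, built only from the reported counts, suggests that the zero percentage is relative to all rows while the mean is taken over the non-missing rows; that reading is an inference from the numbers, not something the report states.

```python
# Figures copied from the otros_ingresos_mensual tables.
n_rows = 91047
missing = 69004
zeros = 15957
reported_sum = 3.133600775e10

non_missing = n_rows - missing

zeros_pct_all = 100 * zeros / n_rows            # share of all rows
zeros_pct_observed = 100 * zeros / non_missing  # share of rows with a value
mean_observed = reported_sum / non_missing      # mean over rows with a value

print(non_missing)
print(round(zeros_pct_all, 1), round(zeros_pct_observed, 1))
print(round(mean_observed, 1))
```

Among the 22043 rows that have a value, roughly 72% are zero, which is consistent with a median of 0 and a third quartile of 300000.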
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   15957   17.5%
1000000             894     1.0%
500000              706     0.8%
400000              506     0.6%
600000              423     0.5%
2000000             388     0.4%
800000              337     0.4%
200000              319     0.4%
300000              279     0.3%
1500000             234     0.3%
Other values (86)   2000    2.2%
(Missing)           69004   75.8%

Value    Count   Frequency (%)
0        15957   17.5%
50000    3       < 0.1%
80000    2       < 0.1%
100000   79      0.1%
120000   7       < 0.1%

Value        Count   Frequency (%)
3143089323   1       < 0.1%
3118534287   2       < 0.1%
3114770734   4       < 0.1%
1000000000   1       < 0.1%
60000000     18      < 0.1%

otros_ingresos_concepto
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing76885
Missing (%)84.4%
Memory size711.4 KiB

municipio_credito
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 13
Median length: 6
Mean length: 6.431820928
Min length: 4

Characters and Unicode

Total characters: 585598
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowARAUQUITA
2nd rowARAUQUITA
3rd rowARAUQUITA
4th rowARAUQUITA
5th rowARAUQUITA
ValueCountFrequency (%)
ARAUCA40814
44.8%
TAME25119
27.6%
ARAUQUITA8799
 
9.7%
SARAVENA5845
 
6.4%
PUERTO RONDON3691
 
4.1%
PUERTO JORDAN3660
 
4.0%
FORTUL3066
 
3.4%
PANAMA51
 
0.1%
CRAVO NORTE2
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
arauca40814
41.5%
tame25119
25.5%
arauquita8799
 
8.9%
puerto7351
 
7.5%
saravena5845
 
5.9%
rondon3691
 
3.8%
jordan3660
 
3.7%
fortul3066
 
3.1%
panama51
 
0.1%
norte2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A195308
33.4%
R73230
 
12.5%
U68829
 
11.8%
T44337
 
7.6%
C40816
 
7.0%
E38317
 
6.5%
M25170
 
4.3%
O21463
 
3.7%
N16940
 
2.9%
Q8799
 
1.5%
Other values (9)52389
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter578245
98.7%
Space Separator7353
 
1.3%

Most frequent character per category

ValueCountFrequency (%)
A195308
33.8%
R73230
 
12.7%
U68829
 
11.9%
T44337
 
7.7%
C40816
 
7.1%
E38317
 
6.6%
M25170
 
4.4%
O21463
 
3.7%
N16940
 
2.9%
Q8799
 
1.5%
Other values (8)45036
 
7.8%
ValueCountFrequency (%)
7353
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin578245
98.7%
Common7353
 
1.3%

Most frequent character per script

ValueCountFrequency (%)
A195308
33.8%
R73230
 
12.7%
U68829
 
11.9%
T44337
 
7.7%
C40816
 
7.1%
E38317
 
6.6%
M25170
 
4.4%
O21463
 
3.7%
N16940
 
2.9%
Q8799
 
1.5%
Other values (8)45036
 
7.8%
ValueCountFrequency (%)
7353
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII585598
100.0%

Most frequent character per block

ValueCountFrequency (%)
A195308
33.4%
R73230
 
12.5%
U68829
 
11.8%
T44337
 
7.6%
C40816
 
7.0%
E38317
 
6.5%
M25170
 
4.3%
O21463
 
3.7%
N16940
 
2.9%
Q8799
 
1.5%
Other values (9)52389
 
8.9%

codeudor
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 711.4 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1092564
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowSIN CODEUDOR
2nd rowSIN CODEUDOR
3rd rowCON CODEUDOR
4th rowSIN CODEUDOR
5th rowSIN CODEUDOR
ValueCountFrequency (%)
SIN CODEUDOR80123
88.0%
CON CODEUDOR10924
 
12.0%
Histogram of lengths of the category
ValueCountFrequency (%)
codeudor91047
50.0%
sin80123
44.0%
con10924
 
6.0%

Most occurring characters

ValueCountFrequency (%)
O193018
17.7%
D182094
16.7%
C101971
9.3%
N91047
8.3%
91047
8.3%
E91047
8.3%
U91047
8.3%
R91047
8.3%
S80123
7.3%
I80123
7.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1001517
91.7%
Space Separator91047
 
8.3%

Most frequent character per category

ValueCountFrequency (%)
O193018
19.3%
D182094
18.2%
C101971
10.2%
N91047
9.1%
E91047
9.1%
U91047
9.1%
R91047
9.1%
S80123
8.0%
I80123
8.0%
ValueCountFrequency (%)
91047
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1001517
91.7%
Common91047
 
8.3%

Most frequent character per script

ValueCountFrequency (%)
O193018
19.3%
D182094
18.2%
C101971
10.2%
N91047
9.1%
E91047
9.1%
U91047
9.1%
R91047
9.1%
S80123
8.0%
I80123
8.0%
ValueCountFrequency (%)
91047
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1092564
100.0%

Most frequent character per block

ValueCountFrequency (%)
O193018
17.7%
D182094
16.7%
C101971
9.3%
N91047
8.3%
91047
8.3%
E91047
8.3%
U91047
8.3%
R91047
8.3%
S80123
7.3%
I80123
7.3%

entidad
Categorical

HIGH CARDINALITY

Distinct: 151
Distinct (%): 0.2%
Missing: 1
Missing (%): < 0.1%
Memory size: 711.4 KiB

Length

Max length: 35
Median length: 16
Mean length: 15.8191903
Min length: 6

Characters and Unicode

Total characters: 1440274
Distinct characters: 44
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 15
Unique (%): < 0.1%

Sample

1st rowINDEP. ARAUQUITA
2nd rowINDEP. ARAUQUITA
3rd rowINDEP. ARAUQUITA
4th rowINDEP. ARAUQUITA
5th rowINDEP. ARAUQUITA
ValueCountFrequency (%)
INDEPENDIENTES ARAUCA20352
22.4%
CONTADO11305
12.4%
CREDITOS TAME11002
12.1%
CONTADOS DE TAME8118
 
8.9%
INDEP. ARAUQUITA5015
 
5.5%
PUEBLO NUEVO3068
 
3.4%
FORTUL2950
 
3.2%
CONTADOS ARAUQUITA2919
 
3.2%
PUERTO RONDON2896
 
3.2%
FONDO EDUCATIVO REGIONAL2584
 
2.8%
Other values (141)20837
22.9%
Histogram of lengths of the category
ValueCountFrequency (%)
arauca24020
12.8%
tame22833
12.2%
independientes22644
12.1%
de15003
 
8.0%
contados13474
 
7.2%
creditos11753
 
6.3%
contado11305
 
6.0%
arauquita8721
 
4.6%
indep5015
 
2.7%
saravena4858
 
2.6%
Other values (179)48271
25.7%

Most occurring characters

ValueCountFrequency (%)
A188981
13.1%
E178020
12.4%
N126055
8.8%
D118282
8.2%
O114616
8.0%
T111900
7.8%
96888
 
6.7%
I92010
 
6.4%
R73792
 
5.1%
C70970
 
4.9%
Other values (34)268760
18.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1330368
92.4%
Space Separator96888
 
6.7%
Other Punctuation10181
 
0.7%
Dash Punctuation1022
 
0.1%
Open Punctuation881
 
0.1%
Close Punctuation859
 
0.1%
Lowercase Letter57
 
< 0.1%
Decimal Number18
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
A188981
14.2%
E178020
13.4%
N126055
9.5%
D118282
8.9%
O114616
8.6%
T111900
8.4%
I92010
6.9%
R73792
 
5.5%
C70970
 
5.3%
S69155
 
5.2%
Other values (18)186587
14.0%
ValueCountFrequency (%)
o22
38.6%
a15
26.3%
r5
 
8.8%
v5
 
8.8%
e5
 
8.8%
n5
 
8.8%
ValueCountFrequency (%)
.8285
81.4%
,1185
 
11.6%
"702
 
6.9%
%9
 
0.1%
ValueCountFrequency (%)
69
50.0%
09
50.0%
ValueCountFrequency (%)
96888
100.0%
ValueCountFrequency (%)
-1022
100.0%
ValueCountFrequency (%)
(881
100.0%
ValueCountFrequency (%)
)859
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1330425
92.4%
Common109849
 
7.6%

Most frequent character per script

ValueCountFrequency (%)
A188981
14.2%
E178020
13.4%
N126055
9.5%
D118282
8.9%
O114616
8.6%
T111900
8.4%
I92010
6.9%
R73792
 
5.5%
C70970
 
5.3%
S69155
 
5.2%
Other values (24)186644
14.0%
ValueCountFrequency (%)
96888
88.2%
.8285
 
7.5%
,1185
 
1.1%
-1022
 
0.9%
(881
 
0.8%
)859
 
0.8%
"702
 
0.6%
69
 
< 0.1%
09
 
< 0.1%
%9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1440033
> 99.9%
None241
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
A188981
13.1%
E178020
12.4%
N126055
8.8%
D118282
8.2%
O114616
8.0%
T111900
7.8%
96888
 
6.7%
I92010
 
6.4%
R73792
 
5.1%
C70970
 
4.9%
Other values (32)268519
18.6%
ValueCountFrequency (%)
Ñ205
85.1%
Á36
 
14.9%

año_credito
Real number (ℝ≥0)

Distinct26
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2014.573846
Minimum1993
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size711.4 KiB

Quantile statistics

Minimum1993
5-th percentile2004
Q12012
median2016
Q32019
95-th percentile2020
Maximum2021
Range28
Interquartile range (IQR)7

Descriptive statistics

Standard deviation5.234178671
Coefficient of variation (CV)0.002598156766
Kurtosis0.5840257855
Mean2014.573846
Median Absolute Deviation (MAD)3
Skewness-1.068515684
Sum183420905
Variance27.39662636
MonotonicityNot monotonic
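
Several of the figures above follow directly from one another. The sketch below re-derives the range, IQR, coefficient of variation, variance, and sum from the other reported año_credito values. The variable names are illustrative, and this is a consistency check rather than the profiler's own code.

```python
# Summary values copied from the año_credito statistics above.
n_rows = 91047
minimum, maximum = 1993, 2021
q1, q3 = 2012, 2019
mean = 2014.573846
std = 5.234178671

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_rows            # sum of all values, up to rounding of the mean

print(value_range, iqr)
print(cv, variance, total)
```

The results agree with the reported values up to the rounding of the printed mean and standard deviation.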
Histogram with fixed size bins (bins=26)
Value | Count | Frequency (%)
2020 | 12916 | 14.2%
2019 | 12395 | 13.6%
2018 | 7488 | 8.2%
2015 | 7058 | 7.8%
2017 | 6514 | 7.2%
2016 | 5967 | 6.6%
2014 | 5513 | 6.1%
2013 | 4956 | 5.4%
2012 | 4464 | 4.9%
2011 | 3847 | 4.2%
Other values (16) | 19929 | 21.9%

Value | Count | Frequency (%)
1993 | 1 | < 0.1%
1997 | 206 | 0.2%
1998 | 404 | 0.4%
1999 | 514 | 0.6%
2000 | 597 | 0.7%

Value | Count | Frequency (%)
2021 | 2020 | 2.2%
2020 | 12916 | 14.2%
2019 | 12395 | 13.6%
2018 | 7488 | 8.2%
2017 | 6514 | 7.2%

mes_credito
Real number (ℝ≥0)

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.831076257
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size711.4 KiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median7
Q310
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)6

Descriptive statistics

Standard deviation3.48121566
Coefficient of variation (CV)0.5096145218
Kurtosis-1.218887257
Mean6.831076257
Median Absolute Deviation (MAD)3
Skewness-0.108812585
Sum621949
Variance12.11886247
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
12 | 9588 | 10.5%
10 | 8806 | 9.7%
11 | 7988 | 8.8%
9 | 7859 | 8.6%
7 | 7722 | 8.5%
6 | 7512 | 8.3%
5 | 7443 | 8.2%
8 | 7060 | 7.8%
3 | 7048 | 7.7%
2 | 6931 | 7.6%
Other values (2) | 13090 | 14.4%

Value | Count | Frequency (%)
1 | 6651 | 7.3%
2 | 6931 | 7.6%
3 | 7048 | 7.7%
4 | 6439 | 7.1%
5 | 7443 | 8.2%

Value | Count | Frequency (%)
12 | 9588 | 10.5%
11 | 7988 | 8.8%
10 | 8806 | 9.7%
9 | 7859 | 8.6%
8 | 7060 | 7.8%

valor_credito
Real number (ℝ≥0)

SKEWED

Distinct6710
Distinct (%)7.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1292348.912
Minimum1
Maximum299250000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size711.4 KiB

Quantile statistics

Minimum1
5-th percentile60000
Q1350000
median881000
Q31570000
95-th percentile4214700
Maximum299250000
Range299249999
Interquartile range (IQR)1220000

Descriptive statistics

Standard deviation1982037.223
Coefficient of variation (CV)1.533670362
Kurtosis6434.10913
Mean1292348.912
Median Absolute Deviation (MAD)591000
Skewness48.6531017
Sum1.176644914 × 10¹¹
Variance3.928471555 × 10¹²
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100000 | 669 | 0.7%
160000 | 641 | 0.7%
200000 | 633 | 0.7%
1200000 | 628 | 0.7%
180000 | 623 | 0.7%
1000000 | 614 | 0.7%
130000 | 598 | 0.7%
50000 | 595 | 0.7%
400000 | 551 | 0.6%
600000 | 546 | 0.6%
Other values (6700) | 84949 | 93.3%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1000 | 6 | < 0.1%
1250 | 1 | < 0.1%
1300 | 1 | < 0.1%
1500 | 1 | < 0.1%

Value | Count | Frequency (%)
299250000 | 1 | < 0.1%
183719000 | 1 | < 0.1%
75160000 | 1 | < 0.1%
41862000 | 1 | < 0.1%
40784000 | 1 | < 0.1%

estado_final
Categorical

HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size711.4 KiB
PAGADO VENCIDO
38859 
PAGADO ANTICIPADO
20290 
CONTADO
19522 
PAGADO A TIEMPO
5989 
DESCUENTO EN VENTA
5342 
Other values (3)
 
1045

Length

Max length18
Median length14
Mean length13.45540215
Min length7

Characters and Unicode

Total characters1225074
Distinct characters17
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
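
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. That the report itself uses this module is an assumption. The function name `category_counts` and the label mapping are illustrative only.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the labels used in this report.
CATEGORY_LABELS = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("PAGADO A TIEMPO")
for code, count in counts.items():
    print(CATEGORY_LABELS.get(code, code), count)
```

Applied to every value of a column and weighted by frequency, this kind of tally would yield tables like "Most occurring categories".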

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPAGADO VENCIDO
2nd rowPAGADO VENCIDO
3rd rowPAGADO VENCIDO
4th rowPAGADO ANTICIPADO
5th rowPAGADO VENCIDO
Value | Count | Frequency (%)
PAGADO VENCIDO | 38859 | 42.7%
PAGADO ANTICIPADO | 20290 | 22.3%
CONTADO | 19522 | 21.4%
PAGADO A TIEMPO | 5989 | 6.6%
DESCUENTO EN VENTA | 5342 | 5.9%
DEVOLUCION | 528 | 0.6%
CARTERA CASTIGADA | 368 | 0.4%
OTROS CIERRES | 149 | 0.2%
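
The length and character totals reported for estado_final can be recovered from these value counts alone. The following sketch recomputes them. The dictionary is transcribed from the table above and the variable names are illustrative.

```python
# Value counts for estado_final, transcribed from the table above.
value_counts = {
    "PAGADO VENCIDO": 38859,
    "PAGADO ANTICIPADO": 20290,
    "CONTADO": 19522,
    "PAGADO A TIEMPO": 5989,
    "DESCUENTO EN VENTA": 5342,
    "DEVOLUCION": 528,
    "CARTERA CASTIGADA": 368,
    "OTROS CIERRES": 149,
}

n_rows = sum(value_counts.values())
# Each value contributes its length once per occurrence.
total_characters = sum(len(v) * c for v, c in value_counts.items())
space_characters = sum(v.count(" ") * c for v, c in value_counts.items())
mean_length = total_characters / n_rows
min_length = min(len(v) for v in value_counts)
max_length = max(len(v) for v in value_counts)

print(n_rows, total_characters, space_characters)
print(mean_length, min_length, max_length)
```

The totals match the figures reported for this variable: 91047 rows, 1225074 characters, 82328 spaces, and a mean length of about 13.455.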
Histogram of lengths of the category
ValueCountFrequency (%)
pagado65138
37.6%
vencido38859
22.4%
anticipado20290
 
11.7%
contado19522
 
11.3%
tiempo5989
 
3.5%
a5989
 
3.5%
venta5342
 
3.1%
descuento5342
 
3.1%
en5342
 
3.1%
devolucion528
 
0.3%
Other values (4)1034
 
0.6%

Most occurring characters

ValueCountFrequency (%)
A203549
16.6%
O176016
14.4%
D150047
12.2%
N95225
7.8%
P91417
7.5%
I86473
7.1%
C85426
7.0%
82328
6.7%
E67410
 
5.5%
G65506
 
5.3%
Other values (7)121677
9.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1142746
93.3%
Space Separator82328
 
6.7%

Most frequent character per category

ValueCountFrequency (%)
A203549
17.8%
O176016
15.4%
D150047
13.1%
N95225
8.3%
P91417
8.0%
I86473
7.6%
C85426
7.5%
E67410
 
5.9%
G65506
 
5.7%
T57370
 
5.0%
Other values (6)64307
 
5.6%
ValueCountFrequency (%)
82328
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1142746
93.3%
Common82328
 
6.7%

Most frequent character per script

ValueCountFrequency (%)
A203549
17.8%
O176016
15.4%
D150047
13.1%
N95225
8.3%
P91417
8.0%
I86473
7.6%
C85426
7.5%
E67410
 
5.9%
G65506
 
5.7%
T57370
 
5.0%
Other values (6)64307
 
5.6%
ValueCountFrequency (%)
82328
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1225074
100.0%

Most frequent character per block

ValueCountFrequency (%)
A203549
16.6%
O176016
14.4%
D150047
12.2%
N95225
7.8%
P91417
7.5%
I86473
7.1%
C85426
7.0%
82328
6.7%
E67410
 
5.5%
G65506
 
5.3%
Other values (7)121677
9.9%

tipo_venta
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size711.4 KiB
ELECTRODOMESTICOS
90236 
MOTOS
 
811

Length

Max length17
Median length17
Mean length16.89311015
Min length5

Characters and Unicode

Total characters1538067
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowELECTRODOMESTICOS
2nd rowELECTRODOMESTICOS
3rd rowELECTRODOMESTICOS
4th rowELECTRODOMESTICOS
5th rowELECTRODOMESTICOS
Value | Count | Frequency (%)
ELECTRODOMESTICOS | 90236 | 99.1%
MOTOS | 811 | 0.9%
Histogram of lengths of the category
ValueCountFrequency (%)
electrodomesticos90236
99.1%
motos811
 
0.9%

Most occurring characters

ValueCountFrequency (%)
O272330
17.7%
E270708
17.6%
T181283
11.8%
S181283
11.8%
C180472
11.7%
M91047
 
5.9%
L90236
 
5.9%
R90236
 
5.9%
D90236
 
5.9%
I90236
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1538067
100.0%

Most frequent character per category

ValueCountFrequency (%)
O272330
17.7%
E270708
17.6%
T181283
11.8%
S181283
11.8%
C180472
11.7%
M91047
 
5.9%
L90236
 
5.9%
R90236
 
5.9%
D90236
 
5.9%
I90236
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Latin1538067
100.0%

Most frequent character per script

ValueCountFrequency (%)
O272330
17.7%
E270708
17.6%
T181283
11.8%
S181283
11.8%
C180472
11.7%
M91047
 
5.9%
L90236
 
5.9%
R90236
 
5.9%
D90236
 
5.9%
I90236
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1538067
100.0%

Most frequent character per block

ValueCountFrequency (%)
O272330
17.7%
E270708
17.6%
T181283
11.8%
S181283
11.8%
C180472
11.7%
M91047
 
5.9%
L90236
 
5.9%
R90236
 
5.9%
D90236
 
5.9%
I90236
 
5.9%

periodo_credito
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size711.4 KiB
MENSUAL(ES)
73386 
DIARIA(S)
16634 
SEMANAL(ES)
 
775
QUINCENAL(ES)
 
252

Length

Max length13
Median length11
Mean length10.6401419
Min length9

Characters and Unicode

Total characters968753
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowMENSUAL(ES)
2nd rowMENSUAL(ES)
3rd rowMENSUAL(ES)
4th rowMENSUAL(ES)
5th rowMENSUAL(ES)
Value | Count | Frequency (%)
MENSUAL(ES) | 73386 | 80.6%
DIARIA(S) | 16634 | 18.3%
SEMANAL(ES) | 775 | 0.9%
QUINCENAL(ES) | 252 | 0.3%
Histogram of lengths of the category
ValueCountFrequency (%)
mensual(es73386
80.6%
diaria(s16634
 
18.3%
semanal(es775
 
0.9%
quincenal(es252
 
0.3%

Most occurring characters

ValueCountFrequency (%)
S165208
17.1%
E148826
15.4%
A108456
11.2%
(91047
9.4%
)91047
9.4%
N74665
7.7%
L74413
7.7%
M74161
7.7%
U73638
7.6%
I33520
 
3.5%
Other values (4)33772
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter786659
81.2%
Open Punctuation91047
 
9.4%
Close Punctuation91047
 
9.4%

Most frequent character per category

ValueCountFrequency (%)
S165208
21.0%
E148826
18.9%
A108456
13.8%
N74665
9.5%
L74413
9.5%
M74161
9.4%
U73638
9.4%
I33520
 
4.3%
D16634
 
2.1%
R16634
 
2.1%
Other values (2)504
 
0.1%
ValueCountFrequency (%)
(91047
100.0%
ValueCountFrequency (%)
)91047
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin786659
81.2%
Common182094
 
18.8%

Most frequent character per script

ValueCountFrequency (%)
S165208
21.0%
E148826
18.9%
A108456
13.8%
N74665
9.5%
L74413
9.5%
M74161
9.4%
U73638
9.4%
I33520
 
4.3%
D16634
 
2.1%
R16634
 
2.1%
Other values (2)504
 
0.1%
ValueCountFrequency (%)
(91047
50.0%
)91047
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII968753
100.0%

Most frequent character per block

ValueCountFrequency (%)
S165208
17.1%
E148826
15.4%
A108456
11.2%
(91047
9.4%
)91047
9.4%
N74665
7.7%
L74413
7.7%
M74161
7.7%
U73638
7.6%
I33520
 
3.5%
Other values (4)33772
 
3.5%

cuotas
Real number (ℝ≥0)

Distinct320
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.934308654
Minimum0
Maximum959
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size711.4 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median3
Q39
95-th percentile15
Maximum959
Range959
Interquartile range (IQR)8

Descriptive statistics

Standard deviation23.38240162
Coefficient of variation (CV)2.946999246
Kurtosis212.8847377
Mean7.934308654
Median Absolute Deviation (MAD)2
Skewness11.09327379
Sum722395
Variance546.7367056
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 40618 | 44.6%
10 | 8973 | 9.9%
5 | 8546 | 9.4%
3 | 4197 | 4.6%
2 | 4157 | 4.6%
12 | 3881 | 4.3%
6 | 3654 | 4.0%
4 | 2897 | 3.2%
14 | 2400 | 2.6%
9 | 2365 | 2.6%
Other values (310) | 9359 | 10.3%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 40618 | 44.6%
2 | 4157 | 4.6%
3 | 4197 | 4.6%
4 | 2897 | 3.2%

Value | Count | Frequency (%)
959 | 1 | < 0.1%
920 | 2 | < 0.1%
909 | 1 | < 0.1%
801 | 1 | < 0.1%
546 | 1 | < 0.1%

forma_pago
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size711.4 KiB
CRÉDITO
69061 
CONTADO
19522 
LIBRANZA
 
2464

Length

Max length8
Median length7
Mean length7.027062946
Min length7

Characters and Unicode

Total characters639793
Distinct characters12
Distinct categories1
Distinct scripts1
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowCRÉDITO
2nd rowCRÉDITO
3rd rowCRÉDITO
4th rowCRÉDITO
5th rowCRÉDITO
Value | Count | Frequency (%)
CRÉDITO | 69061 | 75.9%
CONTADO | 19522 | 21.4%
LIBRANZA | 2464 | 2.7%
Histogram of lengths of the category
ValueCountFrequency (%)
crédito69061
75.9%
contado19522
 
21.4%
libranza2464
 
2.7%

Most occurring characters

ValueCountFrequency (%)
O108105
16.9%
C88583
13.8%
D88583
13.8%
T88583
13.8%
R71525
11.2%
I71525
11.2%
É69061
10.8%
A24450
 
3.8%
N21986
 
3.4%
L2464
 
0.4%
Other values (2)4928
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter639793
100.0%

Most frequent character per category

ValueCountFrequency (%)
O108105
16.9%
C88583
13.8%
D88583
13.8%
T88583
13.8%
R71525
11.2%
I71525
11.2%
É69061
10.8%
A24450
 
3.8%
N21986
 
3.4%
L2464
 
0.4%
Other values (2)4928
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin639793
100.0%

Most frequent character per script

ValueCountFrequency (%)
O108105
16.9%
C88583
13.8%
D88583
13.8%
T88583
13.8%
R71525
11.2%
I71525
11.2%
É69061
10.8%
A24450
 
3.8%
N21986
 
3.4%
L2464
 
0.4%
Other values (2)4928
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII570732
89.2%
None69061
 
10.8%

Most frequent character per block

ValueCountFrequency (%)
O108105
18.9%
C88583
15.5%
D88583
15.5%
T88583
15.5%
R71525
12.5%
I71525
12.5%
A24450
 
4.3%
N21986
 
3.9%
L2464
 
0.4%
B2464
 
0.4%
ValueCountFrequency (%)
É69061
100.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
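
A minimal sketch of that calculation in plain Python follows. The function name `pearson_r` and the toy data are illustrative. The report itself presumably relies on a library routine, which this does not reproduce.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_original = pearson_r(x, y)
# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * a + 7 for a in x], [0.5 * b - 3 for b in y])
print(r_original, r_rescaled)
```

The same 1/n factor is used in the covariance and in both standard deviations, so it cancels. Using n - 1 throughout would give the same r.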

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
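
The rank-then-correlate recipe can be sketched as below. Tied values receive the average of their ranks, which is a common convention assumed here. The helper names `average_ranks` and `spearman_rho` are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# A nonlinear but strictly increasing relationship is perfectly monotonic.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(rho)
```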

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
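
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below with a direct O(n²) loop. Library implementations commonly use tie-adjusted variants instead, so results may differ when the data contain ties. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie, counted as neither
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped middle pair).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```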

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
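
A sketch of the bias-corrected estimator, as it is usually stated, is given below for a contingency table of observed counts. It assumes every row and column total is non-zero. The function name `cramers_v_corrected` is illustrative, and the code is not taken from the report's implementation.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections for the finite-sample bias of phi-squared and of the table size.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfect association
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # independence
print(v_perfect, v_independent)
```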

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
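
Nullity correlation can be read as the ordinary Pearson correlation between missingness indicators. The sketch below computes it for two columns in plain Python, with `None` marking a missing cell. The toy columns are hypothetical and are not taken from the dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / sqrt(var_a * var_b)

# Hypothetical columns: the two are always missing together.
amount = [None, 120000, None, 80000]
concept = [None, "ARRIENDO", None, "VENTAS"]
together = nullity_correlation(amount, concept)

# Hypothetical columns: one is present exactly when the other is missing.
opposite = nullity_correlation([1, None, 3, None], [None, 2, None, 4])
print(together, opposite)
```

A value near +1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is absent.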

Sample

First rows

Index | Row | tipo_identificacion | genero | estado_civil | edad | municipio_residencia | tipo_persona | empresa | municipio_nacimiento | municipio_expedicion | tiene_casa_propia | cargo | sueldo | tiempo_servicio | otros_ingresos_mensual | otros_ingresos_concepto | municipio_credito | codeudor | entidad | año_credito | mes_credito | valor_credito | estado_final | tipo_venta | periodo_credito | cuotas | forma_pago
0 | 1 | Cedula de Ciudadania | Femenino | Union Libre | 32 | ARAUQUITA | Natural | INDEPENDIENTE | ARAUQUITA | ARAUQUITA | Si | COMERCIANTE | 400000 | 2 AÑOS | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 1020000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 12 | CRÉDITO
1 | 2 | Cedula de Ciudadania | Femenino | Union Libre | 41 | ARAUQUITA | Natural | INDEPENDIENTE | ARAUQUITA | ARAUQUITA | Si | COMERCIANTE | 2000000 | 5 AÑOS | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 680000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
2 | 3 | Cedula de Ciudadania | Femenino | Soltero | 23 | ARAUQUITA | Natural | NaN | ARAUQUITA | ARAUQUITA | No | ESTILISTA | 700000 | 3 AÑOS | NaN | NaN | ARAUQUITA | CON CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 1155000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
3 | 4 | Cedula de Ciudadania | Femenino | Soltero | 39 | ARAUQUITA | Natural | FISCALIA | ARAUQUITA | ARAUQUITA | No | ASISTENTE DE FISCAL UNO | 4300000 | 3 AÑOS | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 2 | 3678000 | PAGADO ANTICIPADO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
4 | 5 | Cedula de Ciudadania | Femenino | Soltero | 65 | ARAUQUITA | Natural | INDEPENDIENTE | ARAUQUITA | ARAUQUITA | Si | RESTAURANTE DOÑA RITA | 1500000 | NaN | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 2 | 1920000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
5 | 6 | Cedula de Ciudadania | Femenino | Soltero | 55 | ARAUQUITA | Natural | SEAD ARAUCA | ARAUQUITA | ARAUQUITA | Si | DOCENTE | 3500000 | 28 AÑOS | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 2 | 515000 | PAGADO ANTICIPADO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
6 | 7 | Cedula de Ciudadania | Femenino | Union Libre | 35 | ARAUQUITA | Natural | SENA | ARAUQUITA | ARAUQUITA | Si | AUXILIAR DE ENFERMERIA | 1800000 | 1 AÑO | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 2 | 1050000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 1 | CRÉDITO
7 | 8 | Cedula de Ciudadania | Femenino | Soltero | 46 | ARAUQUITA | Natural | INDEPENDIENTE | ARAUQUITA | ARAUQUITA | No | COMERCIANTE | 600000 | 10 AÑOS | NaN | NaN | ARAUQUITA | CON CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 1614000 | PAGADO ANTICIPADO | ELECTRODOMESTICOS | MENSUAL(ES) | 12 | CRÉDITO
8 | 9 | Cedula de Ciudadania | Femenino | Casado | 29 | ARAUQUITA | Natural | INDEPENDIENTE | ARAUQUITA | ARAUQUITA | Si | INDEPENDIENTE | 1500000 | 5 AÑOS | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 860000 | DEVOLUCION | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO
9 | 10 | Cedula de Ciudadania | Femenino | Soltero | 35 | ARAUQUITA | Natural | FEDECACAO | ARAUQUITA | ARAUQUITA | No | TECNICA DE CAMPO | 950000 | NaN | NaN | NaN | ARAUQUITA | SIN CODEUDOR | INDEP. ARAUQUITA | 2020 | 3 | 545000 | PAGADO VENCIDO | ELECTRODOMESTICOS | MENSUAL(ES) | 5 | CRÉDITO

Last rows

Rowtipo_identificaciongeneroestado_civiledadmunicipio_residenciatipo_personaempresamunicipio_nacimientomunicipio_expediciontiene_casa_propiacargosueldotiempo_serviciootros_ingresos_mensualotros_ingresos_conceptomunicipio_creditocodeudorentidadaño_creditomes_creditovalor_creditoestado_finaltipo_ventaperiodo_creditocuotasforma_pago
9103791038Cedula de CiudadaniaMasculinoUnion Libre49ARAUCANaturalINDEPENDIENTEARAUCAARAUCASiOPERADOR MAQUINARIA PESADA520000015 AÑOSNaNNaNARAUCASIN CODEUDORALCALDIA DE ARAUCA199712680400DESCUENTO EN VENTAELECTRODOMESTICOSMENSUAL(ES)12LIBRANZA
9103891039Cedula de CiudadaniaMasculinoCasado60ARAUCANaturalALCALDIA DE ARAUCAARAUCAARAUCASiBOMBEROS90852632 AÑOSNaNNaNARAUCASIN CODEUDORALCALDIA DE ARAUCA201161912000DESCUENTO EN VENTAELECTRODOMESTICOSMENSUAL(ES)15CRÉDITO
9103991040Cedula de CiudadaniaMasculinoCasado60ARAUCANaturalALCALDIA DE ARAUCAARAUCAARAUCASiBOMBEROS90852632 AÑOSNaNNaNARAUCASIN CODEUDORALCALDIA DE ARAUCA201021800000PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)15CRÉDITO
9104091041Cedula de CiudadaniaMasculinoCasado60ARAUCANaturalALCALDIA DE ARAUCAARAUCAARAUCASiBOMBEROS90852632 AÑOSNaNNaNARAUCASIN CODEUDORALCALDIA DE ARAUCA200871301000DESCUENTO EN VENTAELECTRODOMESTICOSMENSUAL(ES)17LIBRANZA
9104191042Cedula de CiudadaniaMasculinoSoltero47SARAVENANaturalALCALDIA DE SARAVENASARAVENASARAVENASiINSPECTOR DE IMPUESTOS15000002 AÑOSNaNNaNARAUCASIN CODEUDORNaN20093899000PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)6CRÉDITO
9104291043Cedula de CiudadaniaMasculinoSoltero35TAMENaturalINDEPENDENCETAMETAMENoDICTA CAPACITACIONES8500003 MESESNaNNaNARAUCACON CODEUDORSIN ENTIDAD201791350000PAGADO ANTICIPADOELECTRODOMESTICOSMENSUAL(ES)1CRÉDITO
9104391044Cedula de CiudadaniaMasculinoCasado39ARAUCANaturalINDEPENDIENTEARAUCAARAUCANoCOMUNICADOR EN CAMPO35000005 AÑOSNaNNaNARAUCASIN CODEUDORSIN ENTIDAD2017113250000PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)4CRÉDITO
9104491045Cedula de CiudadaniaMasculinoUnion Libre31TAMENaturalINDEPENDIENTETAMETAMENoDELICARNES DUEÑO15000002 AÑOSNaNNaNTAMESIN CODEUDORSIN ENTIDAD2017123410000PAGADO VENCIDOELECTRODOMESTICOSDIARIA(S)153CRÉDITO
9104591046Cedula de CiudadaniaMasculinoCasado51ARAUCANaturalCORPORINOQUIAARAUCAARAUCASiPERIODISTA380000012 MESES0.0NaNARAUCASIN CODEUDORALCALDIA DE ARAUCA200842690000PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)9LIBRANZA
9104691047Cedula de CiudadaniaMasculinoCasado51ARAUCANaturalCORPORINOQUIAARAUCAARAUCASiPERIODISTA380000012 MESES0.0NaNARAUCASIN CODEUDORALCALDIA DE ARAUCA200862628000DESCUENTO EN VENTAELECTRODOMESTICOSMENSUAL(ES)11LIBRANZA